Overview

Dataset statistics

Number of variables: 9
Number of observations: 11497554
Missing cells: 862
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 789.5 MiB
Average record size in memory: 72.0 B

Variable types

Text: 7
Categorical: 2

Alerts

Imbalance: titleType is highly imbalanced (61.9%)
Imbalance: isAdult is highly imbalanced (96.2%)
Unique: tconst has unique values
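
The imbalance percentages in these alerts are consistent with a score of one minus the normalized Shannon entropy of the value counts. The sketch below assumes that definition (the report itself does not state it) and applies it to the titleType counts listed under Common Values. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# titleType counts from the Common Values table. The 11th category is not
# listed there, so its count (1) is inferred from the 11497554 total rows.
title_type_counts = [8842179, 1047689, 708125, 306986, 277985,
                     150130, 60199, 51496, 42183, 10581, 1]

print(f"{imbalance_score(title_type_counts):.1%}")
```

Under this assumption the result should land close to the 61.9% reported for titleType. A perfectly uniform distribution would score 0, and a column with a single dominant value approaches 1.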

Reproduction

Analysis started: 2025-03-06 14:07:41.634718
Analysis finished: 2025-03-06 14:23:33.653208
Duration: 15 minutes and 52.02 seconds
Software version: ydata-profiling v4.13.0
Download configuration: config.json
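
A report of this kind could be regenerated along the following lines. This is a sketch under assumptions: the tab separator, the unquoted fields, and the choice to read every column as a string are guesses based on the Sample section, not settings taken from config.json. The function name, paths, and report title are hypothetical.

```python
def build_report(tsv_path, html_path):
    """Profile a tab-separated file and write the HTML report to html_path."""
    # Imported lazily so this module loads even without the packages installed.
    import csv
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: all columns are read as text, which would explain why
    # year-like columns are typed as Text in this report.
    # Assumption: fields are not quoted, hence QUOTE_NONE.
    df = pd.read_csv(tsv_path, sep="\t", dtype=str, quoting=csv.QUOTE_NONE)

    report = ProfileReport(df, title="Profiling report")  # hypothetical title
    report.to_file(html_path)
    return report
```

Calling `build_report("input.tsv", "report.html")` with a real file path would write the HTML output. On a table of this size, expect a run time of the same order as the duration listed above.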

Variables

tconst
Text

Alert: Unique

Distinct: 11497554
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 87.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 9.5100721
Min length: 9

Characters and Unicode

Total characters: 109342567
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
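
As an illustration of such character properties, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in one of the sample tconst values. The helper name `category_counts` is illustrative. The report lists the category only as (unknown), so this shows what such a breakdown could look like rather than what the profiler computed.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Ll', 'Nd') of the characters in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# 'Ll' is a lowercase letter and 'Nd' is a decimal digit.
counts = category_counts("tt0000001")
print(dict(counts))
```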

Unique

Unique: 11497554
Unique (%): 100.0%

Sample

1st row: tt0000001
2nd row: tt0000002
3rd row: tt0000003
4th row: tt0000004
5th row: tt0000005

Value | Count | Frequency (%)
tt0000003 | 1 | < 0.1%
tt9916880 | 1 | < 0.1%
tt9916824 | 1 | < 0.1%
tt9916826 | 1 | < 0.1%
tt9916830 | 1 | < 0.1%
tt9916832 | 1 | < 0.1%
tt9916834 | 1 | < 0.1%
tt9916836 | 1 | < 0.1%
tt9916838 | 1 | < 0.1%
tt9916840 | 1 | < 0.1%
Other values (11497544) | 11497544 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
t | 22995108 | 21.0%
1 | 11178421 | 10.2%
2 | 10595346 | 9.7%
0 | 9382735 | 8.6%
4 | 8732037 | 8.0%
3 | 8662308 | 7.9%
8 | 8403278 | 7.7%
6 | 8375926 | 7.7%
5 | 7263202 | 6.6%
7 | 6942274 | 6.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 109342567 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 109342567 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 109342567 | 100.0%

titleType
Categorical

Alert: Imbalance

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.7 MiB

Length

Max length: 12
Median length: 9
Mean length: 8.2459255
Min length: 5

Characters and Unicode

Total characters: 94807974
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: short
2nd row: short
3rd row: short
4th row: short
5th row: short

Common Values

Value | Count | Frequency (%)
tvEpisode | 8842179 | 76.9%
short | 1047689 | 9.1%
movie | 708125 | 6.2%
video | 306986 | 2.7%
tvSeries | 277985 | 2.4%
tvMovie | 150130 | 1.3%
tvMiniSeries | 60199 | 0.5%
tvSpecial | 51496 | 0.4%
videoGame | 42183 | 0.4%
tvShort | 10581 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
o | 11107874 | 11.7%
e | 10819650 | 11.4%
v | 10599995 | 11.2%
i | 10559682 | 11.1%
t | 10450842 | 11.0%
s | 10228052 | 10.8%
d | 9191348 | 9.7%
p | 8893675 | 9.4%
E | 8842179 | 9.3%
r | 1396454 | 1.5%
Other values (10) | 2718223 | 2.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 94807974 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 94807974 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 94807974 | 100.0%

primaryTitle
Text

Distinct: 5169977
Distinct (%): 45.0%
Missing: 19
Missing (%): < 0.1%
Memory size: 87.7 MiB

Length

Max length: 458
Median length: 405
Mean length: 19.86731
Min length: 1

Characters and Unicode

Total characters: 228425088
Distinct characters: 203
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 4713837
Unique (%): 41.0%

Sample

1st row: Carmencita
2nd row: Le clown et ses chiens
3rd row: Poor Pierrot
4th row: Un bon bock
5th row: Blacksmith Scene

Value | Count | Frequency (%)
episode | 4829829 | 12.7%
the | 1176720 | 3.1%
dated | 940908 | 2.5%
(blank) | 459714 | 1.2%
of | 404569 | 1.1%
a | 321853 | 0.8%
and | 254404 | 0.7%
in | 232228 | 0.6%
to | 190171 | 0.5%
2 | 150513 | 0.4%
Other values (1413980) | 29071058 | 76.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 26533623 | 11.6%
e | 20063100 | 8.8%
i | 13155102 | 5.8%
o | 12971854 | 5.7%
a | 11418055 | 5.0%
s | 11183018 | 4.9%
d | 10121912 | 4.4%
r | 8223486 | 3.6%
t | 8136502 | 3.6%
n | 8110022 | 3.6%
Other values (193) | 98508414 | 43.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 228425088 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 228425088 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 228425088 | 100.0%

originalTitle
Text

Distinct: 5194979
Distinct (%): 45.2%
Missing: 19
Missing (%): < 0.1%
Memory size: 87.7 MiB

Length

Max length: 458
Median length: 405
Mean length: 19.864735
Min length: 1

Characters and Unicode

Total characters: 228395487
Distinct characters: 190
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 4739045
Unique (%): 41.2%

Sample

1st row: Carmencita
2nd row: Le clown et ses chiens
3rd row: Pauvre Pierrot
4th row: Un bon bock
5th row: Blacksmith Scene

Value | Count | Frequency (%)
episode | 4829765 | 12.7%
the | 1124106 | 3.0%
dated | 940907 | 2.5%
(blank) | 460722 | 1.2%
of | 383798 | 1.0%
a | 313525 | 0.8%
and | 247633 | 0.7%
in | 226191 | 0.6%
to | 187797 | 0.5%
de | 151446 | 0.4%
Other values (1450862) | 29142850 | 76.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 26511220 | 11.6%
e | 20007205 | 8.8%
i | 13203158 | 5.8%
o | 12963151 | 5.7%
a | 11499813 | 5.0%
s | 11188948 | 4.9%
d | 10124757 | 4.4%
r | 8199996 | 3.6%
n | 8137510 | 3.6%
t | 8099495 | 3.5%
Other values (180) | 98460234 | 43.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 228395487 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 228395487 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 228395487 | 100.0%

isAdult
Categorical

Alert: Imbalance

Distinct: 45
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.7 MiB

Length

Max length: 4
Median length: 1
Mean length: 1.0002148
Min length: 1

Characters and Unicode

Total characters: 11500024
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 11125251 | 96.8%
1 | 371479 | 3.2%
1978 | 130 | < 0.1%
1985 | 83 | < 0.1%
1980 | 66 | < 0.1%
1979 | 63 | < 0.1%
1984 | 41 | < 0.1%
1974 | 33 | < 0.1%
1982 | 32 | < 0.1%
1972 | 29 | < 0.1%
Other values (35) | 347 | < 0.1%
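
The year-like entries (1978, 1985, and so on) in a column that otherwise holds 0 and 1 suggest that some rows were parsed with shifted fields, although the report does not confirm the cause. A minimal sketch for locating such rows follows, assuming the data is available as dictionaries keyed by column name. `find_invalid_flags` and the example rows are hypothetical.

```python
def find_invalid_flags(rows, column="isAdult", allowed=("0", "1")):
    """Return (row_index, value) pairs whose flag is outside the allowed set."""
    return [(i, row.get(column)) for i, row in enumerate(rows)
            if row.get(column) not in allowed]

# Hypothetical rows: the second mimics a year landing in the isAdult field.
rows = [
    {"tconst": "example-1", "isAdult": "0"},
    {"tconst": "example-2", "isAdult": "1978"},
    {"tconst": "example-3", "isAdult": "1"},
]
print(find_invalid_flags(rows))
```

Inspecting the flagged rows side by side with their neighbouring columns is usually enough to tell whether a stray delimiter moved the values one field over.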

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
0 | 11125451 | 96.7%
1 | 372300 | 3.2%
9 | 780 | < 0.1%
8 | 469 | < 0.1%
7 | 392 | < 0.1%
2 | 208 | < 0.1%
6 | 146 | < 0.1%
5 | 133 | < 0.1%
4 | 85 | < 0.1%
3 | 58 | < 0.1%
Other values (2) | 2 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 11500024 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 11500024 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 11500024 | 100.0%

startYear
Text

Distinct: 152
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 3.7521087
Min length: 2

Characters and Unicode

Total characters: 43140072
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 1894
2nd row: 1892
3rd row: 1892
4th row: 1892
5th row: 1893

Value | Count | Frequency (%)
n | 1425072 | 12.4%
2021 | 507756 | 4.4%
2022 | 490371 | 4.3%
2018 | 457557 | 4.0%
2023 | 454467 | 4.0%
2019 | 453971 | 3.9%
2017 | 451182 | 3.9%
2020 | 435636 | 3.8%
2016 | 426996 | 3.7%
2015 | 401774 | 3.5%
Other values (142) | 5992772 | 52.1%

Most occurring characters

Value | Count | Frequency (%)
2 | 11306873 | 26.2%
0 | 10422710 | 24.2%
1 | 7298469 | 16.9%
9 | 3997798 | 9.3%
\ | 1425072 | 3.3%
N | 1425072 | 3.3%
8 | 1406997 | 3.3%
7 | 1300422 | 3.0%
3 | 1189585 | 2.8%
4 | 1173709 | 2.7%
Other values (2) | 2193365 | 5.1%
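
The matching counts for `\` and `N` (1425072 each) indicate that the literal string `\N` is stored as a value rather than counted as missing, which would explain why Missing is 0 for this variable. Below is a sketch of recounting with `\N` treated as missing, assuming the values are plain strings. `sentinel_rate` is an illustrative name.

```python
def sentinel_rate(values, sentinel="\\N"):
    """Return the fraction of values equal to the missing-value sentinel."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if v == sentinel) / len(values)

# The five sample rows of this variable plus one sentinel added for illustration.
sample = ["1894", "1892", "1892", "1892", "1893", "\\N"]
print(round(sentinel_rate(sample), 3))

# Using the counts reported above: 1425072 sentinel rows out of 11497554.
print(f"{1425072 / 11497554:.1%}")
```

The same check applies to any other variable in this report whose character table lists `\` and `N` with near-identical counts.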

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 43140072 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 43140072 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 43140072 | 100.0%

endYear
Text

Distinct: 97
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.7 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.0238319
Min length: 1

Characters and Unicode

Total characters: 23269117
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: \N
2nd row: \N
3rd row: \N
4th row: \N
5th row: \N

Value | Count | Frequency (%)
n | 11360548 | 98.8%
2019 | 7352 | 0.1%
2018 | 7252 | 0.1%
2017 | 7182 | 0.1%
2020 | 7157 | 0.1%
2021 | 7019 | 0.1%
2022 | 6577 | 0.1%
2023 | 6000 | 0.1%
2016 | 5741 | < 0.1%
2024 | 4968 | < 0.1%
Other values (87) | 77758 | 0.7%

Most occurring characters

Value | Count | Frequency (%)
\ | 11360548 | 48.8%
N | 11360548 | 48.8%
2 | 151769 | 0.7%
0 | 139739 | 0.6%
1 | 97384 | 0.4%
9 | 60023 | 0.3%
8 | 22145 | 0.1%
7 | 18925 | 0.1%
6 | 15490 | 0.1%
3 | 14663 | 0.1%
Other values (2) | 27883 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 23269117 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 23269117 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 23269117 | 100.0%

runtimeMinutes
Text

Distinct: 958
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.7 MiB

Length

Max length: 23
Median length: 2
Mean length: 1.9859534
Min length: 1

Characters and Unicode

Total characters: 22833606
Distinct characters: 43
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 280
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 5
3rd row: 5
4th row: 12
5th row: 1

Value | Count | Frequency (%)
n | 7844505 | 68.2%
30 | 340204 | 3.0%
60 | 255478 | 2.2%
22 | 198493 | 1.7%
45 | 105243 | 0.9%
15 | 102327 | 0.9%
25 | 82379 | 0.7%
44 | 82274 | 0.7%
23 | 76354 | 0.7%
10 | 76343 | 0.7%
Other values (948) | 2333954 | 20.3%

Most occurring characters

Value | Count | Frequency (%)
N | 7844508 | 34.4%
\ | 7844505 | 34.4%
2 | 1210172 | 5.3%
0 | 1096535 | 4.8%
1 | 1008872 | 4.4%
3 | 777354 | 3.4%
4 | 768323 | 3.4%
5 | 725104 | 3.2%
6 | 525085 | 2.3%
8 | 371123 | 1.6%
Other values (33) | 662025 | 2.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 22833606 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 22833606 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 22833606 | 100.0%

genres
Text

Distinct: 2385
Distinct (%): < 0.1%
Missing: 824
Missing (%): < 0.1%
Memory size: 87.7 MiB

Length

Max length: 32
Median length: 28
Mean length: 10.943875
Min length: 2

Characters and Unicode

Total characters: 125818780
Distinct characters: 37
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 213
Unique (%): < 0.1%

Sample

1st row: Documentary,Short
2nd row: Animation,Short
3rd row: Animation,Comedy,Romance
4th row: Animation,Short
5th row: Short

Value | Count | Frequency (%)
drama | 1298950 | 11.3%
comedy | 751748 | 6.5%
talk-show | 725249 | 6.3%
news | 598679 | 5.2%
documentary | 554505 | 4.8%
drama,romance | 524304 | 4.6%
n | 506043 | 4.4%
reality-tv | 365972 | 3.2%
adult | 314492 | 2.7%
news,talk-show | 260313 | 2.3%
Other values (2375) | 5596475 | 48.7%

Most occurring characters

Value | Count | Frequency (%)
a | 13326852 | 10.6%
m | 9982927 | 7.9%
o | 9576901 | 7.6%
r | 8418910 | 6.7%
e | 8413216 | 6.7%
, | 6840408 | 5.4%
y | 5834103 | 4.6%
t | 5794351 | 4.6%
i | 4832789 | 3.8%
n | 4507740 | 3.6%
Other values (27) | 48290583 | 38.4%
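
Because each genres value is a comma-separated list, combined strings such as drama,romance hide how often each individual genre occurs. Below is a minimal sketch of splitting and counting, using the five sample rows of this variable. `genre_counts` is an illustrative helper, and treating `\N` as missing is an assumption.

```python
from collections import Counter

def genre_counts(values, sentinel="\\N"):
    """Count individual genres across comma-separated genre strings."""
    counts = Counter()
    for value in values:
        if value is None or value == sentinel:
            continue  # skip missing entries
        counts.update(part.strip() for part in value.split(",") if part.strip())
    return counts

sample = ["Documentary,Short", "Animation,Short",
          "Animation,Comedy,Romance", "Animation,Short", "Short"]
print(genre_counts(sample).most_common(2))
```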

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 125818780 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 125818780 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 125818780 | 100.0%

Correlations

 | isAdult | titleType
isAdult | 1.000 | 0.097
titleType | 0.097 | 1.000
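
The report does not say which measure produced the 0.097 value. For two categorical variables, one common choice is Cramér's V, and the sketch below computes the plain (uncorrected) version from a contingency table of counts. The function name and the example tables are illustrative and are not derived from this dataset.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

print(cramers_v([[10, 0], [0, 10]]))  # perfectly associated categories
print(cramers_v([[5, 5], [5, 5]]))    # independent categories
```

Values near 0, such as the one in the matrix above, indicate a weak association between the two variables under whichever measure was used.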

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
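
Nullity correlation can be sketched as the Pearson correlation between two columns' missing-value indicators. The example below assumes rows are dictionaries and that `None` marks a missing value. Both the helper name and the toy rows are hypothetical.

```python
import math

def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if row.get(col_a) is None else 0.0 for row in rows]
    b = [1.0 if row.get(col_b) is None else 0.0 for row in rows]
    n = len(rows)
    if n == 0:
        return float("nan")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_a * var_b)

# Toy rows: originalTitle is missing whenever primaryTitle is, plus once more.
rows = [
    {"primaryTitle": None, "originalTitle": None},
    {"primaryTitle": "A", "originalTitle": "A"},
    {"primaryTitle": "B", "originalTitle": None},
    {"primaryTitle": "C", "originalTitle": "C"},
]
print(nullity_correlation(rows, "primaryTitle", "originalTitle"))
```

A value of 1 means the two columns are always missing together, and a value near 0 means missingness in one says little about the other.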

Sample

First rows

 | tconst | titleType | primaryTitle | originalTitle | isAdult | startYear | endYear | runtimeMinutes | genres
0 | tt0000001 | short | Carmencita | Carmencita | 0 | 1894 | \N | 1 | Documentary,Short
1 | tt0000002 | short | Le clown et ses chiens | Le clown et ses chiens | 0 | 1892 | \N | 5 | Animation,Short
2 | tt0000003 | short | Poor Pierrot | Pauvre Pierrot | 0 | 1892 | \N | 5 | Animation,Comedy,Romance
3 | tt0000004 | short | Un bon bock | Un bon bock | 0 | 1892 | \N | 12 | Animation,Short
4 | tt0000005 | short | Blacksmith Scene | Blacksmith Scene | 0 | 1893 | \N | 1 | Short
5 | tt0000006 | short | Chinese Opium Den | Chinese Opium Den | 0 | 1894 | \N | 1 | Short
6 | tt0000007 | short | Corbett and Courtney Before the Kinetograph | Corbett and Courtney Before the Kinetograph | 0 | 1894 | \N | 1 | Short,Sport
7 | tt0000008 | short | Edison Kinetoscopic Record of a Sneeze | Edison Kinetoscopic Record of a Sneeze | 0 | 1894 | \N | 1 | Documentary,Short
8 | tt0000009 | movie | Miss Jerry | Miss Jerry | 0 | 1894 | \N | 45 | Romance
9 | tt0000010 | short | Leaving the Factory | La sortie de l'usine Lumière à Lyon | 0 | 1895 | \N | 1 | Documentary,Short

Last rows

 | tconst | titleType | primaryTitle | originalTitle | isAdult | startYear | endYear | runtimeMinutes | genres
11497544 | tt9916838 | tvEpisode | Episode #3.13 | Episode #3.13 | 0 | 2009 | \N | \N | Drama
11497545 | tt9916840 | tvEpisode | Horrid Henry's Comic Caper | Horrid Henry's Comic Caper | 0 | 2014 | \N | 11 | Adventure,Animation,Comedy
11497546 | tt9916842 | tvEpisode | Episode #3.16 | Episode #3.16 | 0 | 2009 | \N | \N | Drama
11497547 | tt9916844 | tvEpisode | Episode #3.15 | Episode #3.15 | 0 | 2009 | \N | \N | Drama
11497548 | tt9916846 | tvEpisode | Episode #3.18 | Episode #3.18 | 0 | 2009 | \N | \N | Drama
11497549 | tt9916848 | tvEpisode | Episode #3.17 | Episode #3.17 | 0 | 2009 | \N | \N | Drama
11497550 | tt9916850 | tvEpisode | Episode #3.19 | Episode #3.19 | 0 | 2010 | \N | \N | Drama
11497551 | tt9916852 | tvEpisode | Episode #3.20 | Episode #3.20 | 0 | 2010 | \N | \N | Drama
11497552 | tt9916856 | short | The Wind | The Wind | 0 | 2015 | \N | 27 | Short
11497553 | tt9916880 | tvEpisode | Horrid Henry Knows It All | Horrid Henry Knows It All | 0 | 2014 | \N | 10 | Adventure,Animation,Comedy